Lessons from NCLB for the
Every Student Succeeds Act 

William J. Mathis, University of Colorado Boulder 
Tina M. Trujillo, University of California Berkeley 


Executive Summary 

The No Child Left Behind Act was replaced by the Every Student Succeeds Act (ESSA) with 
great fanfare and enthusiasm. Granting more power to states and curbing what was seen as 
federal overreach was well received. Nevertheless, the new system remains a predominantly
test-based accountability system that requires interventions in the lowest scoring five percent
of schools. The new law continues to disaggregate data by race and by wealth (and adds
new sub-groups) but shows little promise of remedying the systemic under-resourcing of
needy students. Giving the reform policies of high-stakes assessment and privatization the 
benefit of the most positive research interpretation, the benefits accrued are insufficient to 
justify their use as comprehensive reform strategies. Less generous interpretations of the research
provide clear warnings of harm. The research evidence over the past 30 years further
tells us that unless we address the economic bifurcation in the nation, and the opportunity 
gaps in the schools, we will not be successful in closing the achievement gap. Although not 
strong enough to tip the balance, ESSA does provide states with a valuable new tool. School 
reports will now be incorporating one or more non-academic indicators that can help bring 
attention to the nation’s broader educational purposes. 

As state policymakers implement their revised programs, we offer the following recommendations
on both broad and focused implementation issues:

• Above all else, each state must ensure that students have adequate opportunities, 
funding and resources to achieve state goals. Funds must be available in an equitable
manner and must be sufficient to meet students’ needs. Schools and school
personnel must not be evaluated on elements where they are denied the resources 
and supports they need to be successful. 1 

• States must shift toward an assistance role and exercise less of a regulatory role. 
States must assure that all students have equal access to high-quality teachers, 
stimulating curriculum and instruction and adequate school resources (such as 
computers, libraries, field trips, and learning resources). 2 

• Under ESSA, school performance will now be measured using a system that incorporates
one or more non-academic indicators, chosen separately by each state.
These non-academic indicators provide states their strongest new tool for maximizing
educational equity and opportunity and bringing attention to the nation’s
broader educational purposes. 

• States and districts must collaborate with social service and labor departments 
to ensure adequate personal, social and economic opportunities. Without a livable
wage and adequate support services, social problems will be manifest in the
schools. Public and private schools must adopt assignment policies and practices
that ensure integration and that disperse pockets of poverty. 3 

• Although President-elect Trump has called for expanding charter schools, the research
evidence does not support expansion. The number of charter schools should
be reduced. On average, charter schools do not perform at higher levels than public
schools, yet they segregate, 4 remain prone to fiscal mismanagement, 5 and often
have opaque management and accountability. 6 

• Development of multiple-measure and dashboard accountability approaches must 
be comprehensive, balanced between inputs and outcomes, expressed clearly, and 
assessed. As contrasted with a convenient collection of available data, the information
must accurately and validly reflect the desired learning outcomes and the input
resources needed to achieve these outcomes. 

• Standardized test scores must be used cautiously and only in combination with 
other data to avoid creating incentives for narrowed and distorted teaching and 
learning. Further, the weak technical strength of standardized assessments and value-added
models renders these approaches invalid for use in a high-stakes context. 7

• The aggregation of data into a single score or grade should be avoided. Such procedures
hide valuable information while invalidly combining disparate and unrelated
objects. 8 

• States and school districts must train educators to conduct formative and constructive
self-evaluations. The current emphasis on outcome-based evaluations does not
capture the diverse universe of teaching. 

• States should establish, develop, train and implement school visitation teams that 
address both quantitative and qualitative factors. Sites most in need of improvement
should be prioritized. Standardized test scores can be validly used to establish
initial priorities. 9 

• External reviews should focus on providing guidance and capacity-building support 
for school development and improvement, rather than on imposing sanctions. 10 

• External reviewers should be qualified experts who meet prescribed standards. Robust
training should be compulsory, with retraining required on a periodic basis.

• Multiple stakeholders (administrators, teachers, students, parents, community 
leaders, and researchers) should be involved in the design of the state’s evaluation 
or inspectorate program. 

• States should use the flexibility and the assessment pilot project alternatives in 
ESSA to test fewer grades. If local assessments are employed in the remaining 
grades, avoid attempting to equate different tools or develop growth scores with a 
potpourri of different instruments. The technique does not have sufficient technical 
power to justify such usage. 11 

• States and districts must apply more stringent criteria in adopting interventions. 
Many commercial presentations, packages, and “best practices” lack a scientific 
foundation. Peer-reviewed literature must be employed to vet promising practices. 

A number of positive elements have also been illuminated and represent wise educational
investments. The following five approaches are among the most important but these should 
not be viewed as a complete or exhaustive list. 

• Early education - The achievement gap is already a standard deviation wide by 
age three or four and does not decrease as children go through school. Thus, the imperative
is high-quality early education, which also has one of the highest rates of
“return on investment.” Early education should concentrate on broad-based experiential
learning. An emphasis on subject matter knowledge and formal assessment
should be avoided until grades three or four. 12 

• Extended school year and day - Expanding learning time and using that additional
time for deep, engaging enrichment, either after school or during the summer,
can be effective in closing the achievement gap. Again, the emphasis must be
on high-quality and comprehensive programs as contrasted with low-substance 
and test prep approaches. 13 

• De-tracking - Tracking or “ability grouping” stratifies the learning opportunities 
of students inside of a school building, often segregating by race, ethnicity and socioeconomic
status, thereby denying the most marginalized students a high-quality
education. 14 

• Class size reduction - Smaller class sizes show great advantages, helping teachers
teach and helping students learn, but these reforms invariably are revisited in
times of fiscal constraint. 15 

• School-community partnerships - Particularly for children who live in places
where stable housing, employment and other opportunities are largely denied,
the provision of health, social, medical and dental support becomes essential. 16 It 
is also particularly important for schools in these communities to develop strong, 
mutually respectful partnerships with parents and other community members. 

Lessons from NCLB for the
Every Student Succeeds Act 


If Lyndon Johnson were alive today, he would undoubtedly be discouraged to see what has 
become of the original Elementary and Secondary Education Act (ESEA) that he signed 
into law fifty years ago as a part of the War on Poverty. ESEA was deeply rooted in ideals of 
democracy and equity. It signified the federal government’s pledge to create equal educational
opportunity by increasing funding and school improvement resources for states and
districts. The goal seemed simple: strengthen the capacity of our most economically impoverished
schools to provide high quality public education for all students.

Despite this legislative commitment to public schools, our lawmakers have largely eroded 
ESEA’s original intent. Moving from assistance to ever-increasing regulation, states gravitated
toward test-based reforms in the minimum basic skills movement in the 1970s. A
watershed event occurred in 1983 with the report, A Nation at Risk, which was predicated
on international economic competitiveness and rankings on test scores. The report was
succeeded by Goals 2000, the first federal act to require states to develop standards-based 
test goals and measure progress toward them. The stringent and reductionist No Child Left 
Behind (NCLB) Act of 2001 then followed on its heels. At each step, our educational policies
became more test-based, top-down, prescriptive, narrow and punitive, and federal support 
to build the most struggling schools’ capacity for improvement faded. 

Most recently, on December 10th, 2015, amid much fanfare from both sides of the aisle, 
President Obama signed the Every Student Succeeds Act (ESSA), which reauthorized ESEA. 
These last two revisions of the federal legislation, NCLB and ESSA, have moved the country 
farther and farther away from the original principle behind ESEA, which was to use federal 
funding to increase protections for historically underserved students. It was originally a civil 
rights initiative. While ESSA shifted the accountability mechanisms to the states, this latest 
iteration of the law does not reflect what we know and what we need to ensure equal educational
opportunities for all children across the nation.

Unfortunately, ESSA preserves most of the unproductive structures and reforms that NCLB 
prescribed. It is true that looming threats of certain sanctions - both to schools and educators
themselves - have been scaled back. Unattainable Adequate Yearly Progress targets
no longer exist but are replaced by state sanctions on schools. States’ flexibility has been 
restored to look somewhat like the first-generation, 17 state-level systems that preceded (and, 
ironically, informed) NCLB. 

But at its core, ESSA is still a primarily test-based educational regime. Annual standardized 
testing in reading and math is still mandated in grades 3-8 and once in high school. Science 
testing at benchmark levels of schooling remains. The criteria for requiring schools to write 
improvement plans have been revised, yet standardized test scores continue to comprise the 
largest share of these criteria. Identification of schools in need of improvement continues to 
depend mostly on test scores, but now also includes one or more other academic and quality 
indicators. Formerly rigid prescriptions for school reforms have been relegated to districts 
and states, although the expanded range of potential reforms still encourages and funds 
charter schools and requires other NCLB-like “corrective actions.” State accountability systems
must be federally approved and mechanisms such as turnaround-driven layoffs, conversions
to charter schools, and school closures are likely to continue even though they have
not been proven to consistently improve schools in struggling communities. Punishments 
for continual low test-performance persist. The most substantial difference is that the power
to decide which test-based consequences apply to under-performing schools resides once again in
the states, not the federal government.

In order for ESSA to achieve the kind of significant, equity-minded improvements that its 
original proponents imagined, state-level policymakers must be willing to significantly depart
from NCLB practices and norms. They will need to adopt a set of driving principles
and aims for schools that have been nearly absent from the discourse on and practice of 
school reform for the past thirty years. We will need to return to investing in inputs for under-resourced
schools and shift away from strictly monitoring performance outputs. This is
a herculean task. An extensive body of research reminds us how our norms are powerfully 
entrenched about which communities deserve which resources and which learners are able 
to achieve at consistently high levels, 18 and practical experiences over the last three decades 
have socialized entire generations of policymakers, practitioners, and even some researchers
to accept the current manner of doing business in schools.

In drawing lessons from the nation’s experiences, we must first examine the principles and 
purposes of education to see how they are reflected in the laws. The details of and changes 
in the statutes have been charted by a number of organizations. But stepping back from the 
comparative details, we must examine the broad research lessons from NCLB. The most 
over-arching questions surround poverty, the efficacy of reforms based on high-stakes assessment,
and the effects of privatization. Also brought to the fore are topics such as the
utility of multiple measures and the role of school self-evaluations. Following this comprehensive
review, we derive specific lessons to guide state and local policymakers in effective
practices. 


I. NCLB and ESSA: Commonalities and Contrasts 

From a teacher’s point-of-view, the new law continues the basic operations and principles 
of the previous law: It remains fundamentally a test-driven, top-down, remediate-and-penalize
law. Despite the “too much testing” outcry, the same tests in the same grades are
required while states are “allowed” to employ more exams. The main difference is that 
instead of federal mandates, the states are required to redefine and implement many of the 
same features previously required by the federal government. 

While states set standards, the law still requires the same performance levels: schools are
held accountable for results, poorly performing schools are identified, and schools “in
need of improvement” must show progress in three years or be met with “more rigorous 
improvement actions.” 

Like NCLB, it is underfunded. Comprehensive improvement support is required but state 
and local willingness or capacity and the federal forecast are not promising. 

With overtones of a political grudge match, much of the Washington excitement surrounding
the law’s passage was based on curtailing the federal department’s authority. To a local
district or school, however, it makes little difference whether the mandate comes from the 
federal government or the state government. The driving principles, sanctions and rewards 
remain and, in most states, will be directed by the same people acting in much the same
roles. 

Note must be taken of the civil rights groups’ reservations and concerns. Disaggregation of 
state testing data by race and socioeconomic level remains (and has been expanded) but the 
re-introduction of standard setting by states and accountability decision making will, over 
time, likely result in varied state expectations in goals, funding and technical support for 
improvement from one state to another. 19 


II. First-Order Lessons for ESSA 

“Where we sit determines what we see.” What we have learned differs by individual, organizational
affiliation and ideology. Some have argued that NCLB did not work because it
was not pursued aggressively. In this thinking, we should “double down” on the previous 
strategies. But there are few observers who say NCLB worked or is workable. (Otherwise, the
federal government would not have needed to issue waivers and the achievement gap would
have closed.) Nevertheless, while ideologically affiliated organizations invariably find results
supportive of their perspective, there is a mainstream research consensus on what we
have learned: 


A. The opportunity gap - We cannot expect to close the achievement gap until 
we address the social and economic gaps that divide our society. 

No Child Left Behind had the explicit purpose of all children achieving high standards and 
thereby closing the achievement gap by 2014. It did not come close. 

Noting the widening academic achievement gap between rich and poor, Sean Reardon found
the gap “roughly 20 to 40 percent larger among children born in 2001 than among those
born 25 years earlier.” 20 The irony is that the very problem the law was supposed to fix became
worse.

In an economic and social shift, he reports that family income is now nearly as strong a 
predictor as parental education. The income achievement gap, which is closely tied to the 
racial gap, is attributable to income inequality, the increased difficulty of social mobility, the 
bifurcation of wages and the economy, and a narrowing of school purposes driven by test 
taking. 21 The racial gap was closing until the early 1990s - at the same time that test-based 
accountability was in its ascendancy in Goals 2000 and subsequently in NCLB. 22 Harris and 
Herrington attribute the earlier gains to the pre-1990 exposure of children of color to greater
learning resources and academic content. 23

Low test scores are indicators of our social inequities, argue Berliner and Rothstein. Otherwise,
we would not see our white and affluent children scoring at the highest levels in the
world and our children of color scoring at levels equivalent to those of third-world countries. 24 We also would
not see our urban areas, with the lowest scores and greatest needs, funded well below our 
higher scoring suburban schools. 25 

With two-thirds of the variance in test scores attributable to environmental conditions, the 
best way of closing the opportunity gap is through providing jobs and livable wages across 
the board. We must also deal with governmentally determined housing patterns that segregate
our children. As Richard Rothstein observed and Heather Schwartz noted in Maryland,
integrated housing breaks up patterns and pockets of concentrated poverty and has positive 
effects on children and schools. 26 

As for school resources, the Education Trust determined that children in low-income districts
receive 10% less per pupil (or $1200) and children of color receive 15% less (or $2000)
per year. 27 While Rebell has noted that 60% of the school funding court cases result in decisions
in the plaintiffs’ favor, that does not mean they win a solution in the legislature. 28 In
terms of compensatory funding, the national average is to provide an extra 19% of funding 
but some 70 adequacy studies from across the nation show that an additional 40% to 100% 
is needed. The needs of English learners show that 76% to 118% more is needed, depending
on the state. 29 Furthermore, in 2015, seven years after the recession, 31 states were still
spending less than what they were in 2008. 30 

One effective compensatory approach is to combine social, educational and health services,
as Valli et al. outline. These approaches have proven successful (and are encouraged
in ESSA), but they represent a difficult management conundrum as different domains and
funding sources must live under one roof. 31 The Harlem Children’s Zone has been highlighted
as a model of such interagency support and collaboration but has been subject to controversy
regarding costs and questions about sustainability. 32

One of the frequently heard phrases used to justify annual high-stakes disaggregated assessment
is that “shining a light” on deficiencies of particular groups will prompt decision-makers
to increase funding, expand programs, and ensure high quality. This has not happened.
Shining a light does not provide the social and educational learning essentials for our neediest
children. It merely establishes the convenient illusion of doing something productive (at
little cost), while blaming the schools and the victims. It is an excuse for avoiding legal and 
moral obligations. 


B. High-stakes, test-based accountability does not improve learning. 

Since B.F. Skinner’s work sixty years ago, it has been repeatedly confirmed that negative 
reinforcement has unpredictable and undesirable consequences. While NCLB promised help 
for schools “in need of assistance,” this phrase became an Orwellian euphemism. Schools so 
classified were portrayed popularly and in the media as “failing” schools. Needless to say, 
the schools with the lowest scores tended to have the largest social and economic challenges. 

After their review of test-based accountability, the independent and prestigious National 
Academies reported that “...the measured effects to date tend to be concentrated in elementary
grade mathematics and the effects are small compared to the improvements the
nation hopes to achieve.” 

The federal strategy under NCLB relied on four intervention models (transformation, turnaround,
restart, and closure). Under ESSA, the design of the accountability system devolves to the
states. Given the prior investment of states in the federal models, coupled with the passage 
of enabling state laws and regulations, it is likely that many will continue to employ the same 
intervention strategies—at least over the short term. 

Throughout the past decade, a number of test-based accountability mechanisms have been 
tried with generally weak results: 

Test-Based Teacher Evaluation

This was the most popular of the turnaround strategies, employed by 74% of the schools in 
the now defunct School Improvement Grants. 33 

Evaluating teachers by test scores breaks down in several logical and empirical ways. First,
the approach assumes that students are randomly assigned to teachers, which is demonstrably
not the case in school practice. Some teachers teach remedial classes while others teach
advanced placement students. Further, a given teacher could be (and has been) rated a
success in one year and a failure in the
next simply based on the students assigned. Second, the error rate inherent in this approach 
is so high that it simply precludes its use in high-stakes circumstances. 34 Third, there is no 
general teaching factor that is universally applicable to all cases. This renders the model invalid
for general application. Fourth, alternative explanations of gains (or losses) caused by
factors outside the teacher’s control have typically not been properly considered. 35 

The use of value-added measures provoked the unusual response of a cautionary statement 
by the American Educational Research Association as well as a warning from the American 
Statistical Association. Their concerns are that VAM ratings are highly unstable, unduly influenced
by class composition, and do not disentangle the many other influences on student
scores. 36 

While much good research continues, the use of this technique in broad scale, high-stakes 
circumstances is not warranted and raises compelling ethical questions. 


School Turnarounds 

The rationale is that under fear of being fired, teachers and principals will be motivated 
to improve student test scores. Used in at least 16% of the intervention cases, 37 the limited 
high-quality research in this area tells us that massive staff changes almost always harm 
rather than help struggling schools. The systemic disruption, decreased efficiency, human 
capital and organizational commitment losses argue against using such an approach. Turnaround
schools must also have sufficient resources, time and adequate support structures
to be effective and to attract and retain qualified personnel. 38 Replacing administrators and 
staff in urban and rural areas is a major obstacle and qualified people are often not available. 39

The research literature in this area is littered with (1) the pervasive use of advocacy and journalistic
“case studies” and (2) the abundance of unscientific “guides” setting forth unsubstantiated
general principles for successfully implementing turnarounds. 40 One researcher
found a modest but positive 0.10 standard deviation gain for turnaround schools in California,
but recognized the presence of added social services in the selected schools, smaller
class sizes (five fewer students per class), and possible selection effects in his “fuzzy” regression
discontinuity model. He raised the questions of whether these gains were sustainable
and whether the method is cost-effective at scale. 41 

All in all, there is no compelling evidence base showing the viability of this approach. 42 In 
reviewing the literature, AIR reported the somewhat universal finding that “success rates for 
school turnarounds are low and many such turnarounds are short-lived.” 43 

Closing the School and Enrolling the Students in a Higher Performing School

Limited research is available on this least common turnaround strategy. Some researchers 
find that school closures deepen divides and actually harm achievement while others show 
limited effects on test scores. 

In Chicago, a small proportion of displaced students who attended significantly higher performing
schools mildly improved their test scores. However, 82% did not attend a higher
performing school and, therefore, gleaned no such advantage. The problem, of course, is 
that higher performing schools are not always conveniently available. 44 Unfortunately, displaced
students were less likely to attend summer schools and were more likely to transfer
again. 45 Overall, following an initial drop, Chicago students did no worse and no better as a 
result of school closures. In Washington DC, the results were similar. 46 

Kemple found a similar pattern in studying 29 closed schools in New York City. The displaced
students performed no better than before but newly incoming students did better
than their predecessors, particularly in graduation rates. Nevertheless, this performance 
was quite low. Since closures were only one of a constellation of concurrent reforms, sorting 
out causality was problematic. Kemple concludes that closure of poorly performing schools 
was positive but a broader array of reforms is needed. 47 

In following the experiences of predominantly economically disadvantaged, high school 
children of color who were relocated through a school closure, Kirshner and his colleagues 
found relocated students registered lower test scores, lower graduation rates, increased 
dropouts, and increased signs of stress. These are unfortunate effects for a program designed
to remedy, not aggravate, these very same problems. 48

Overall, the early and limited research shows little to suggest that school closure is a practical
or effective vehicle for eliminating the achievement gap or for providing equality. The
major strategies used in our persistently lowest performing schools, individually and collectively,
failed to provide more than limited gains. In some cases, the effects were negative,
particularly as they relate to equity goals. 


C. Privatizing schools has not produced across-the-board or meaningful 
learning gains. It leads to social segregation and is harmful to society. 

Reflected in federal policy in ESSA as well as NCLB is the underlying reform belief that a 
competitive market will solve school problems. Under NCLB, this predominantly took the 
form of charter schools. Fortunately, there is an abundant research literature available. The 
relevant question is “Can charter schools close the achievement gap?” Rhetorical and advocacy
claims set aside, even the most optimistic findings provide little promise of achieving
this goal. 


Achievement 

Perhaps the most prominent study of charter schools is the 27-state CREDO study. 49 It has 
been repeatedly cited by pro-charter advocacy organizations. The report states, “While much 
ground remains to be covered, charter schools in the 27 states are outperforming their TPS 
peer schools in greater numbers than in 2009.” However, this carefully crafted sentence
obscures the finding that gains were only found in reading. There was no difference in math
scores. Reviewers then noted: 

“...the study overall shows that less than one-hundredth of one percent of the 
variation in test performance is explainable by charter school enrollment. With 
a very large sample size, nearly any effect will be statistically significant, but in 
practical terms these effects are so small as to be regarded, without hyperbole, 
as trivial.” 50 
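
The reviewers' point about sample size can be illustrated with a short calculation. The sketch below is not drawn from the CREDO study or its review; it assumes a simple correlation test, takes 0.0001 (one-hundredth of one percent) as an illustrative upper bound on the variance explained, and uses hypothetical sample sizes. It shows how the same negligible effect moves from "not significant" to "highly significant" purely as the number of students grows.

```python
import math

def t_statistic(r_squared: float, n: int) -> float:
    """t statistic for testing a zero correlation, given the share of
    variance explained (r_squared) and the sample size (n)."""
    r = math.sqrt(r_squared)
    return r * math.sqrt((n - 2) / (1.0 - r_squared))

# Illustrative assumption: 0.0001 = one-hundredth of one percent of variance.
variance_explained = 0.0001

# Hypothetical sample sizes, not taken from any study.
for n in (1_000, 100_000, 1_000_000):
    t = t_statistic(variance_explained, n)
    # 1.96 is the usual two-sided 5% critical value for large samples.
    print(f"n={n:>9,}  t={t:5.2f}  exceeds 1.96: {abs(t) > 1.96}")
```

Under these assumptions the effect size never changes; only the sample size does. That is why a statistically significant finding in a very large data set says little about practical importance.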

In an exhaustive meta-evaluation of the charter school research, Miron and Urschel concluded,
“...cumulative results from charter school research indicate, that, on the whole, charters
perform similarly to traditional public schools.” 51


Segregation 

More troubling than the lack of gains in test scores is the mounting evidence that charter 
schools segregate students by race, income, language, and handicap. 52 This is a particularly 
problematic finding for a law whose express purpose is to advance equality and close performance
gaps.

In examining the counter-claim that charter schools do not segregate, Miron and his colleagues
observed, “While the aggregate percentage of minority students in charter schools
is similar to that of the sending districts...Charter school enrollment tends to fall into a
bimodal distribution, with either high concentration minority or high concentration white.
(T)hree quarters of the charter schools were either segregative white, segregative black, or
segregative Hispanic.” 53


III. Lessons for State Accountability Systems 

While the concept of accountability has been a necessary feature for as long as we have provided
universal public education, policymakers have struggled to find a successful approach
given the broad and changing purposes of education. The use of student testing, with pub¬ 
lished teacher and student test scores, can be traced to the 1870s. 54 

Despite the prescriptiveness of NCLB, considerable variation in school approval systems 
took place as a result of the federal waiver process and other interpretations in the law. 55 
Nonetheless, much of this variation is in small points. The core of test-based accountability 
remains. 56 

The new ESSA law, while erasing waivers, provides some additional latitude in the number 
and types of measures states and local districts may use. Yet, the state-designed accountability
systems are still subject to federal approval.

Two paramount issues require consideration: (a) combining multiple measures in such a 
way as to validly reflect the goals and purposes of education, which includes inputs as well as 
outputs; and (b) developing assessment systems beyond traditional tests and empirical indicators
that consider necessary climate and cultural features of a sound education system.
These will require different mechanisms such as inspectorate systems, self-evaluations and 
site visits conducted by qualified disinterested visitors representing the state or an accreditation
group.


What is to be Measured: Multiple Measures 

The ESSA requires “substantial weight” be placed on a combination of four variables
(academic achievement, student growth, graduation rates and English proficiency). At least one
“school quality” indicator must be included and 95% of students must be tested. 57 The school
quality indicators allow states (and in turn districts and schools) to bring in a whole variety
of measures beyond standardized tests. While these might be limited in how much they can
be weighted, they can nevertheless be placed on the public table and included.

This is a positive development in that a more comprehensive set of measures will be more
likely to validly capture the broader set of cognitive and affective learning goals of
schooling. 58 Unfortunately, “multiple measures” is an elastic term that includes an eclectic variety
of elements. Depending upon the speaker and whatever pre-existing data are at hand in a
given state, the term can mean many different things and thus result in many different
policy approaches.

This elasticity is exemplified in its use as a bridging concept between dramatically different
policy camps. Linda Darling-Hammond and Paul Hill, for instance, released coordinated
reports addressing elements to be included in the next generation of school evaluation
systems. 59 Agreement on what should be measured has been characterized by vague generalities
such as the need for assessment of “college and career ready” standards, the use of
evaluation consequences at the school level, that outside intervention be required and available,
the proper role of government and the like. These agreements are at such a high level of
abstraction that “multiple measures” is more a rhetorical consensus than a verifiable
accountability model.

In looking at the pre-ESSA federal “waivers,” 24 of 27 applying states proposed a wide
variety of multiple measures. 60 In 2009, individual states identified from four to 22 different
measures, which were characterized by a strong collection of outcome measures and a
virtual absence of opportunity, input, or process measures. 61

In order to have consistency across schools, the proposed “dashboards” are composed almost 
exclusively of empirical measures with data elements such as truancy, graduation rates, and 
disciplinary referrals. 62 These have the advantage of being highly reliable because they have 
a standard meaning across schools. But their validity, as a measure of school quality, is open 
to question. 

ESSA requires a composite “report card” grade be constructed from a broad array of
common measures. It also requires a single score for each school. Both of these concepts have
difficulties.

For multiple measures, a particular problem is the assignment of weights to the various
measures. 63 For example, can 70% passing a math test be added to a 10% decrease in
disciplinary referrals, and should this be adjusted for socioeconomic factors and school history?
While a number of statistical techniques (such as factor analysis) show promise for
addressing these concerns, current decisions appear to be based on the judgment of individuals or
working groups. 64 Schools do not have a single purpose and there is no rational way of
developing a composite that will be uniformly satisfactory. Deciding on what measures will be
used and how they will be combined is one of the most critical decisions states may make.
Whether they are narrow and limited or multidisciplinary and higher order will determine
schools’ goals and directions. Carefully developed, the selection, use and publicity given to
non-academic indicators can provide states with a more valid and accurate picture of their
state’s educational system.
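
The weighting problem can be made concrete with a small sketch. The Python fragment
below is illustrative only: the two schools, their indicator values, and both weighting
schemes are hypothetical and are not drawn from any state's system. It also assumes each
indicator has already been rescaled to a common 0-1 range, which is itself a judgment call.
The point is simply that the same data can rank schools differently depending on the
weights a working group happens to choose.

```python
# Hypothetical indicator values, assumed to be already rescaled to 0-1.
schools = {
    "School A": {"math_pass_rate": 0.70, "referral_reduction": 0.10},
    "School B": {"math_pass_rate": 0.60, "referral_reduction": 0.40},
}


def composite(indicators, weights):
    """Weighted average of the indicators; weights are normalized to sum to 1."""
    total = sum(weights.values())
    return sum(indicators[name] * w / total for name, w in weights.items())


def rank(schools, weights):
    """Return school names ordered from highest to lowest composite score."""
    return sorted(
        schools,
        key=lambda name: composite(schools[name], weights),
        reverse=True,
    )


# Two equally defensible (and equally arbitrary) weighting schemes.
test_heavy = {"math_pass_rate": 0.9, "referral_reduction": 0.1}
balanced = {"math_pass_rate": 0.5, "referral_reduction": 0.5}

print(rank(schools, test_heavy))  # School A ranks first
print(rank(schools, balanced))    # School B ranks first
```

Under the test-heavy weights School A scores 0.64 against School B's 0.58; under the
balanced weights the order reverses, 0.40 against 0.50. Nothing in the data settles which
ranking is correct, which is why the choice of weights is a policy decision rather than a
purely technical one.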


How it is Measured: School Self-Evaluations and Inspection Teams 

While empirical outcome measures are required for the “substantial weight” in the federal 
evaluation system, and despite the problems noted directly above, that does not mean that 
other vital information should be ignored by state and local officials. A simple expedient 
may be to provide the federal government with their data while simultaneously conducting 
a more useful process within each state. 

While eclipsed by test-based models in the United States, self-evaluations (frequently
combined with inspectorate systems) continue to be the norm in most OECD countries. The
closest U.S. parallels are regional accreditation organizations that guide self-evaluations
and organize visiting teams. The method is particularly used in higher education. Basically,
the school conducts a structured self-evaluation. Then, in many cases, a visiting review team
validates the self-evaluation report. 65 Through interviews and data review, the team seeks
to verify such non-quantifiable yet vital things as express student expectations, the
comprehensiveness of assessments, curricular adequacy, professional development, available
supports and the quality of interventions for high-needs children. 66

Thus, school evaluations can be broader and more inclusive, and are less likely to distort 
school goals for teaching and learning. Also, a self-evaluation can be more revealing of needs 
than a staged show for visitors. 67 

Yet, such self-evaluations are no panacea. “Despite its long history and ubiquity, inspection
has existed until comparatively recently in an a-theoretical limbo with practices and
procedures assessed on little more than the commonsense of those who commend or criticize
them.” 68 The evaluation problem is that cause and effect are hard to nail down. 69 For
example, did the new textbooks recommended by the team result in better teaching and learning?
Would the school have purchased the materials anyway? One clear finding, however, is that
interviews of participants show a positive view of self-evaluations and inspectorates, with
90% of principals and teachers in Great Britain reporting satisfaction with the system. 70


IV. Recommendations for Policymakers and School Practice 

It is a daunting, if not impossible, task to reduce all major ESSA decisions to a short set of
recommendations. Nonetheless, it is a necessary task. The recommendations are grouped
by level, beginning with state policies, then assessment and accountability systems,
instructional improvement and, finally, effective school practices.


A. State Policy 

• Above all else, each state must assure that students have adequate opportunities,
funding and resources to achieve state goals. Funds must be available in an
equitable manner and must be sufficient to meet students’ needs. Schools and school
personnel must not be evaluated on elements where they are denied the resources
and supports they need to be successful. 71

• States must shift toward an assistance role and exercise less of a regulatory role. 
States must assure that all students have equal access to high-quality teachers, 
stimulating curriculum and instruction and adequate school resources (such as 
computers, libraries, field trips, and learning resources). 72 

• Under ESSA, school performance will now be measured using a system that
incorporates one or more non-academic indicators, chosen separately by each state.
These non-academic indicators provide states their strongest new tool for
maximizing educational equity and opportunity and bringing attention to the nation’s
broader educational purposes.

• States and districts must collaborate with social service and labor departments
to ensure adequate personal, social and economic opportunities. Without a
livable wage and adequate support services, social problems will be manifest in the
schools. Public and private schools must adopt assignment policies and practices
that ensure integration and that disperse pockets of poverty. 73


B. Assessment and Accountability 

• Charter schools should not be expanded, and state caps on their approval should 
be reduced. On average, charter schools do not perform at higher levels than public 
schools, yet they segregate, 74 remain prone to fiscal mismanagement, 75 and often 
have opaque management and accountability. 76 

• Development of multiple-measure and dashboard accountability approaches must
be comprehensive, balanced between inputs and outcomes, expressed clearly, and
assessed. As contrasted with a convenient collection of available data, the
information must accurately and validly reflect the desired learning outcomes and the input
resources needed to achieve these outcomes.

• Standardized test scores must be used cautiously and only in combination with
other data to avoid creating incentives for narrowed and distorted teaching and
learning. Further, the weak technical strength of standardized assessments and
value-added models renders these approaches invalid for use in a high-stakes
context. 77

• The aggregation of data into a single score or grade should be avoided. Such
procedures hide valuable information while invalidly combining disparate and unrelated
objects. 78


C. Instructional Improvement 

• States and school districts must train educators to conduct formative and
constructive self-evaluations. The current emphasis on outcome-based evaluations does not
capture the diverse universe of teaching.

• States should establish, develop, train and implement school visitation teams that
address both quantitative and qualitative factors. Sites most in need of
improvement should be prioritized. Standardized test scores can be validly used to establish
initial priorities. 79

• External reviews should focus on providing guidance and capacity-building support 
for school development and improvement, rather than on imposing sanctions. 80 

• External reviewers should be qualified experts who meet prescribed standards.
Robust training should be compulsory, with retraining required on a periodic basis.

• Multiple stakeholders (administrators, teachers, students, parents, community
leaders, and researchers) should be involved in the design of the state’s
evaluation/inspectorate program.

• States should use the flexibility and the assessment pilot project alternatives in 
ESSA to test fewer grades. If local assessments are employed in the remaining 
grades, avoid attempting to equate different tools or develop growth scores with a 
potpourri of different instruments. The technique does not have sufficient technical 
power to justify such usage. 81 

• States and districts must apply more stringent criteria in adopting interventions. 
Many commercial presentations, packages, and “best practices” lack a scientific 
foundation. Peer-reviewed literature must be employed to vet promising practices. 

A number of positive elements have also been illuminated and represent wise educational 
investments. The following five approaches are among the most important but should not be 
viewed as a complete or exhaustive list. 

• Early education - The achievement gap is already a standard deviation wide by
age three or four and does not decrease as children go through school. Thus, the
imperative is high-quality early education, which also has one of the highest rates
of “return on investment.” Early education should concentrate on broad-based
experiential learning. An emphasis on subject matter knowledge and formal
assessment should be avoided until grades three or four. 82

• Extended school year and day - Expanding learning time and using that
additional time for deep, engaging enrichment, either after school or during the
summer, can be effective in closing the achievement gap. Again, the emphasis must be
on high-quality, comprehensive programs as contrasted with low-substance,
test-prep approaches. 83

• De-tracking - Tracking or “ability grouping” stratifies the learning opportunities
of students inside of a school building, often segregating by race, ethnicity and
socioeconomic status, thereby denying the most marginalized students a
high-quality education. 84

• Class size reduction - Smaller class sizes show great advantages, helping
teachers teach and helping students learn, but these reforms invariably are revisited in
times of fiscal constraint. 85

• School-community partnerships - Particularly for children who live in places
where stable housing, employment and other opportunities are largely denied, the
provision of health, social, medical and dental support becomes essential. 86 It is
also particularly important for schools in these communities to develop strong,
mutually respectful partnerships with parents and other community members.


V. The Moral Imperative: Adequate Inputs and the Opportunity Gap 

Laws for the encouragement of virtue and the prevention of vice and
immorality ought to be constantly kept in force, and duly executed; and a
competent number of schools ought to be maintained . . .

Vermont Constitution, 1777 

In these eighteenth-century words, “ought” meant “shall.” “Virtue” meant civic virtue and
contributing to your society, while “vice” meant actions antithetical to the common good.
Within this phrase lies the purpose of education in a democratic society. No words are spent
on international threats or on “being competitive in a global economy.” Though often
parroted, these latter rationales have only a dubious connection to either education or the
economy. 87

The nation has become a majority of minorities and the common good requires all students
to be well educated. Yet, we have embarked on economic and educational paths that
systematically privilege only a small percentage of the population. 88 In education, we invest less
in children of color and the economically impoverished. 89 At the same time, we support a
testing regime that measures wealth rather than provides a rich kaleidoscope of experience
and knowledge to all.

And we do not hold ourselves responsible for the basic denial of equal opportunities. 

[I]f schools are being held accountable for improving teaching and student 
learning, policymakers at all levels of the educational system, regional and 
state levels as well as the national level, should also be expected to support the 
capacity required to produce improved teaching and learning (p. 21). 90 

The greatest and most damaging conceptual mistake of test-based accountability systems
has been the pretense that poorly supported schools could systemically overcome the effects
of concentrated poverty and racial segregation by rigorous instruction and testing. 91 This
system has inadequately supported teachers and students, has imposed astronomically high
goals, and has inflicted punishment on those for whom it has demanded impossible
achievements. It stands in stark contrast to what Lyndon Johnson envisioned over fifty years ago
when lawmakers first crafted the Elementary and Secondary Education Act.

Public schools can only succeed in achieving their democratic purpose of educating all
children with all-around support and accountability. 92 This means holding state and federal
governments accountable for ensuring that children have legitimate, adequate and equitable
opportunities to learn. Ultimately, a child denied opportunities will arrive at school with
high needs, and a school without adequate resources cannot effectively address those needs.
No amount of testing and improvement plans can succeed absent a strong support system.

In a nation that prides itself on its achievements, the lack of opportunities provided to our
neediest children is not morally justifiable. If we earnestly want to grasp the slipping-away
American Dream, we must invest simultaneously in our economy, our society and our
schools.